Prognostic impact of the AML ELN2022 risk classification in patients undergoing allogeneic stem cell transplantation

For most patients with acute myeloid leukemia (AML), an allogeneic hematopoietic stem cell transplantation (HSCT) offers the highest chance of cure. Recently, the European LeukemiaNet (ELN) published updated recommendations on the diagnosis and risk classification of AML based on genetic factors at diagnosis, as well as a dynamic adjustment (reclassification) according to the measurable residual disease (MRD) status for the favorable and intermediate risk groups. Validation of the ELN2022 risk classification has not been reported. We retrospectively analyzed 522 AML patients who received an HSCT at a median age of 59 (range 16–76) years. For patients with adequate material available and in remission prior to HSCT (n = 229), the MRD status was evaluated. Median follow-up after HSCT was 3.0 years. ELN2022 risk at diagnosis was favorable in 22%, intermediate in 26%, and adverse in 52% of patients. ELN2022 risk at diagnosis was associated with the cumulative incidence of relapse/progression (CIR), event-free survival (EFS), and overall survival (OS) in the whole patient cohort, as well as in the subgroup of patients transplanted in first remission. However, risk stratification based on the ELN2022 classification did not significantly improve outcome prognostication in comparison to the ELN2017 classification. In our study, the newly added group of patients with myelodysplasia-related gene mutations did not have adverse outcomes. Reclassifying these patients into the intermediate risk group and adjusting the grouping for all AML patients by MRD at HSCT led to a refined and improved risk stratification, which should be validated in independent studies.


INTRODUCTION
In 2022, an expert panel on behalf of the European LeukemiaNet (ELN) defined and published a revised risk classification system for acute myeloid leukemia (AML), which was first established in 2010 and first revised in 2017 [1][2][3]. While the distribution into three genetic risk groups at diagnosis, as introduced by the ELN2017 classification, was maintained, some relevant changes were made, reflecting new insights into AML disease biology and risk stratification. Among the major changes, CEBPA mutations categorized as ELN2022 favorable risk are now restricted to in-frame mutations in the basic leucine zipper (bZIP) region, irrespective of whether they occur as mono- or biallelic mutations [4][5][6]. Moreover, the former division into high or low FLT3-ITD allelic ratio (AR) was abandoned, allocating all patients harboring an FLT3-ITD to the ELN2022 intermediate risk group, irrespective of the presence or absence of an NPM1 mutation. NPM1 mutations continue to indicate favorable outcomes in the absence of an FLT3-ITD, with the new exception of co-occurring adverse risk cytogenetics, which now indicates ELN2022 adverse risk. Also, the definition of a complex karyotype changed with the exclusion of hyperdiploid karyotypes with multiple trisomies from this group. Mutations in the three genes ASXL1, RUNX1, and TP53 had already been introduced as new adverse risk prognostic factors in the ELN2017 risk classification. Now, additionally, so-called myelodysplasia-related gene mutations, i.e., mutations in the genes BCOR, EZH2, SF3B1, SRSF2, STAG2, U2AF1, or ZRSR2, define ELN2022 adverse risk in the absence of favorable risk genetics. Finally, a 10% variant allele frequency (VAF) threshold has been introduced for TP53 mutations to allocate individuals to ELN2022 adverse risk [3].
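As an illustration only, the molecular rules listed in this paragraph can be written down as a small decision function. This is a simplified sketch, not the full ELN2022 table: it ignores most cytogenetic categories, the `Genetics` record and its field names are hypothetical, the order in which the rules are checked is an assumption of this sketch, and the TP53 threshold is assumed to be inclusive (VAF of at least 10%).

```python
from dataclasses import dataclass

# Genes named in the text as adverse-defining in the absence of favorable
# genetics (myelodysplasia-related genes plus ASXL1 and RUNX1).
MR_GENES = frozenset(
    {"ASXL1", "BCOR", "EZH2", "RUNX1", "SF3B1", "SRSF2", "STAG2", "U2AF1", "ZRSR2"}
)


@dataclass
class Genetics:
    """Hypothetical container for the diagnostic findings used below."""
    cebpa_bzip_inframe: bool = False
    npm1_mutated: bool = False
    flt3_itd: bool = False
    adverse_cytogenetics: bool = False
    tp53_vaf: float = 0.0            # 0.0 means no TP53 mutation detected
    mutated_genes: frozenset = frozenset()


def eln2022_sketch(g: Genetics) -> str:
    """Return 'favorable', 'intermediate', or 'adverse' (simplified)."""
    # Assumed precedence: adverse cytogenetics or TP53 (VAF >= 10%) come first,
    # which also covers mutated NPM1 with adverse-risk cytogenetics.
    if g.adverse_cytogenetics or g.tp53_vaf >= 0.10:
        return "adverse"
    # Favorable genetics: in-frame CEBPA bZIP, or NPM1 without FLT3-ITD.
    if g.cebpa_bzip_inframe:
        return "favorable"
    if g.npm1_mutated and not g.flt3_itd:
        return "favorable"
    # Mutated NPM1 together with FLT3-ITD is intermediate.
    if g.npm1_mutated and g.flt3_itd:
        return "intermediate"
    # Myelodysplasia-related gene mutations without favorable genetics.
    if MR_GENES & set(g.mutated_genes):
        return "adverse"
    # Everything else, including FLT3-ITD with wild-type NPM1, is intermediate.
    return "intermediate"
```

For example, `eln2022_sketch(Genetics(npm1_mutated=True, flt3_itd=True))` returns `"intermediate"` regardless of any allelic ratio, which mirrors the removal of the FLT3-ITD AR described above.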
Previous work has shown the prognostic power of the ELN risk classifications published in 2010 [7,8] and 2017 [9][10][11][12]. While this was seen irrespective of whether the patients were consolidated by chemotherapy or allogeneic hematopoietic stem cell transplantation (HSCT) [7,9], the separation of outcome curves according to the ELN risk groups seemed to be less pronounced in individuals who received an allogeneic HSCT, thereby strengthening the use of HSCT consolidation in higher-risk AML patients. In contrast to previous ELN risk stratification systems, the feasibility of the latest update, published in 2022, remains to be demonstrated; this was the main objective of our study.

PATIENTS AND METHODS
Patients and treatment
We retrospectively analyzed 522 AML patients (median age 59, range 16-76 years) who received an allogeneic HSCT between January 2000 and December 2021 at our center and who had adequate diagnostic information available to group them unambiguously into one of the three ELN2022 risk groups. Patients were either treated with standard intensive cytarabine-based chemotherapy or received hypomethylating agents with or without venetoclax as induction treatment. Remissions status at HSCT

Cytogenetics and molecular markers
Cytogenetic analyses at diagnosis were performed using standard techniques of banding and in situ hybridization. Pretreatment genomic DNA was screened for the presence of FLT3-ITD, as well as the mutation status of the genes CEBPA and NPM1 as previously described [9,16]. In patients with adequate samples available, the diagnostic mutation status of 54 genes recurrently mutated in myeloid malignancies was evaluated using next-generation sequencing (Illumina, San Diego, CA, USA) as previously described [17]. ASXL1 mutations at codon 646 were validated by applying a proofreading polymerase-based Sanger sequencing approach [17].

MRD assessment prior to allogeneic HSCT
For patients transplanted in CR or CRi with adequate bone marrow or peripheral blood material acquired ≤28 days prior to HSCT available (n = 229), the MRD status was assessed using digital droplet polymerase chain reaction (PCR) for at least one of the targets NPM1 mutation, BAALC/ABL1 copy numbers, and MN1/ABL1 copy numbers, or using quantitative reverse transcriptase PCR for WT1/ABL1 expression levels, adapting the previously published cut-offs [18][19][20][21]. Patients with at least one positive test result were regarded as pre-HSCT MRD-positive.
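The "at least one positive test" rule can be sketched in a few lines. This is only an illustration: the function name, the assay labels, and the convention that `None` marks an assay that was not performed are assumptions of this sketch, and the assay-specific cut-offs from the cited references are not reproduced here.

```python
def pre_hsct_mrd_status(assay_results):
    """Combine per-assay results into one pre-HSCT MRD status.

    assay_results maps an assay label (e.g. "NPM1", "BAALC/ABL1") to
    True (positive), False (negative), or None (not performed).
    Returns "MRD-positive", "MRD-negative", or None if nothing was tested.
    """
    performed = [result for result in assay_results.values() if result is not None]
    if not performed:
        return None  # no evaluable material, so no MRD status
    # A single positive assay is enough to call the patient MRD-positive.
    return "MRD-positive" if any(performed) else "MRD-negative"
```

With this rule, a patient with a negative NPM1 result but a positive WT1/ABL1 result is classified as MRD-positive.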

Statistical analyses
Cumulative incidence of relapse (CIR) was calculated from HSCT to relapse using the Fine and Gray method, treating non-relapse mortality (NRM), calculated from HSCT to death without relapse, as a competing risk [22]. Event-free survival (EFS) was calculated from HSCT until relapse or death, and overall survival (OS) from HSCT until death from any cause; both were estimated using the Kaplan-Meier method, and groups were compared using the log-rank test. For outcome calculations at 3 years after HSCT, the respective 95% confidence intervals (CI) are presented in Supplementary Table S2. Associations with baseline clinical, demographic, and molecular features were compared using the Kruskal-Wallis test and Fisher's exact test for continuous and categorical variables, respectively. Receiver operating characteristic (ROC) curves were used as graphical plots to depict the predictive value of selected variables. All P values are two-sided. All statistical analyses were performed using the R statistical software platform (version 4.0.2) [23].
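To make the competing-risk setup concrete, the following is a minimal sketch of a nonparametric cumulative incidence estimate (of the Aalen-Johansen type), in which relapse and non-relapse death compete and the all-cause event-free survival is the Kaplan-Meier estimate. It is not the Fine and Gray regression itself and not the R code used in the study; the event coding (0 = censored, 1 = relapse, 2 = death without relapse) and the function name are assumptions of this sketch.

```python
from collections import Counter


def cumulative_incidence(times, events, cause):
    """Nonparametric cumulative incidence for one cause under competing risks.

    times  : follow-up times from HSCT
    events : 0 = censored, 1 = relapse, 2 = death without relapse (assumed coding)
    cause  : the event code whose cumulative incidence is wanted
    Returns a list of (time, cumulative_incidence, event_free_survival) tuples,
    one per distinct time at which any event occurred.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0   # Kaplan-Meier event-free survival just before the current time
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        counts = Counter()
        j = i
        while j < len(data) and data[j][0] == t:   # group ties at time t
            counts[data[j][1]] += 1
            j += 1
        d_any = sum(n for code, n in counts.items() if code != 0)
        if d_any:
            # Probability of being event-free until t, then failing from `cause`.
            cif += surv * counts[cause] / n_at_risk
            surv *= 1.0 - d_any / n_at_risk
            curve.append((t, cif, surv))
        n_at_risk -= j - i   # events and censored patients leave the risk set
        i = j
    return curve
```

A useful sanity check on any such estimate is that, at every time point, CIR, NRM, and EFS sum to 1, since every patient is either event-free, relapsed, or dead without relapse.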

RESULTS
Clinical and biologic characteristics within the three ELN2022 risk groups
Within the ELN2022 adverse risk group, patients were older (P < 0.001, Table 1), more often had secondary or treatment-related AML (P < 0.001), had a lower white blood cell count (P < 0.001), and lower blast percentages in the blood (P < 0.001) and bone marrow (P < 0.001) than patients with ELN2022 favorable or intermediate risks. While patients with ELN2022 adverse risk also less often received their allogeneic HSCT in CR/CRi (P < 0.001) and, of the patients transplanted in CR/CRi, less often in second than in first CR/CRi (P = 0.003), the MRD status at HSCT did not differ between the three ELN2022 risk groups (P = 0.40). Gene mutations not included in the ELN2022 risk classification differed significantly between the three risk groups: FLT3-TKD, DNMT3A, and KIT mutations were most frequently observed in ELN2022 favorable risk patients (P = 0.007, P = 0.002, and P = 0.005, respectively), WT1 mutations most frequently in ELN2022 intermediate risk patients (P = 0.02), JAK2 mutations most frequently in ELN2022 adverse risk patients (P = 0.04), and TET2 mutations less frequently in ELN2022 intermediate risk patients (P < 0.001). The immunophenotype significantly differed between the three ELN2022 risk groups, including a stepwise higher diagnostic burden of the immature CD34+/CD38− cell population with higher ELN2022 risk (P < 0.001; for details, please see Supplementary Information).
Fig. 1 Risk distribution and outcomes according to the ELN2022 genetic risk groups at diagnosis. A Transition plot of risk distribution between the ELN2017 and ELN2022 risk stratification systems at diagnosis. B Cumulative incidence of relapse, C Event-free survival, and D Overall survival according to the ELN2022 genetic risk groups in the whole patient cohort (n = 522).
Prognostic relevance of the three ELN2022 risk groups
In AML patients receiving an allogeneic HSCT, the allocation of patients into the three ELN2022 risk groups resulted in a significantly distinct CIR (P < 0.001), EFS (P < 0.001), and OS (P < 0.001; Fig. 1B-D and Supplementary Table S2). Also, in multivariate analyses, the ELN2022 risk at diagnosis remained a prognostic factor for all analyzed endpoints (Table 2). Similar results regarding the prognostic relevance of the ELN2022 risk stratification were observed when restricting the analysis to patients transplanted in morphologic remission (Supplementary Fig. S2). The best outcome separation by the ELN2022 classification was observed for patients transplanted in the first CR/CRi (Fig. 2). In contrast, although limited by lower patient numbers, in patients transplanted in second CR/CRi or without CR/CRi, only patients with favorable ELN2022 risk performed better than those with intermediate or adverse risk, and no distinct outcomes were observed between the latter two groups (Supplementary Fig. S3). The ELN2022 classification also distinguished outcomes among patients younger (CIR P < 0.001, EFS P < 0.001, OS P < 0.001, Supplementary Fig. S4) and older (CIR P = 0.002, EFS P = 0.01, OS P = 0.10) than 60 years at HSCT, although outcome differences, especially between ELN2022 favorable and intermediate risks, were less pronounced in older AML patients. The diagnostic qualifiers newly introduced into the ELN2022 classification are discussed, and their impact on outcomes is shown, in Supplementary Information and Supplementary Fig. S5.

Outcomes according to different genetic characteristics within the three ELN2022 risk groups
To gain further insight into the prognostic significance of included genetic aberrations, the three ELN2022 risk groups were analyzed separately. Within the ELN2022 favorable risk group, patients with core-binding factor (CBF) AML tended to have longer EFS than patients with in-frame CEBPA bZIP or NPM1 mutations (Fig. 3A), but did not differ in CIR or OS (Supplementary Fig. S6A, B).
Also, within the ELN2022 intermediate risk group, there were no significant outcome differences between AML patients with a high or low FLT3-ITD allelic ratio (0.5 cut-off), t(9;11), or other ELN2022 intermediate risk aberrations (Fig. 3B and Supplementary Fig. S6C, D).

Refinement of the ELN2022 risk classification
Because the ELN2022 risk classification did not discriminate outcomes better than the ELN2017 classification, and because outcomes differed within the ELN2022 at diagnosis adverse risk group in the transplant setting, we sought to refine it by introducing two changes. First, since we observed that patients with myelodysplasia-related gene mutations did not have adverse outcomes, we reclassified these individuals as ELN2022 refined at diagnosis intermediate risk.
Second, since the MRD status at HSCT was able to refine outcomes in all three ELN2022 risk groups, we expanded the proposed MRD-adjusted reclassification of the favorable and intermediate groups to all three risk groups. This led us to divide the patient set into three MRD-adjusted risk groups (see also Supplementary Fig. S7). Comparing the c-statistics of the ELN2022 risk groups and our refined models at diagnosis as well as at HSCT (Fig. 5 and Supplementary Fig. S8), the refined models performed significantly better in predicting relapse (at diagnosis P = 0.007, at HSCT P = 0.001) or an event (at diagnosis P = 0.002, at HSCT P = 0.05) one year after HSCT, while only the refined model at diagnosis (P = 0.02), but not that at HSCT (P = 0.77), performed better in predicting death one year after HSCT. Definitions of the risk models are given in Supplementary Table S3.
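For readers less familiar with the c-statistic, the sketch below computes it for an ordinal risk grouping against a binary one-year endpoint, as the share of (event, no event) patient pairs in which the patient with the event carries the higher risk score, with ties counted as one half. It assumes every patient has complete one-year follow-up (no censoring before one year), encodes risk groups as 0/1/2 by assumption, and does not include the statistical comparison between two models reported above.

```python
def c_statistic(risk_scores, had_event):
    """Area under the ROC curve for an ordinal score and a binary outcome.

    risk_scores : numeric risk level per patient (e.g. 0 = favorable,
                  1 = intermediate, 2 = adverse; assumed encoding)
    had_event   : True if the patient had the event within one year
    """
    cases = [s for s, e in zip(risk_scores, had_event) if e]
    controls = [s for s, e in zip(risk_scores, had_event) if not e]
    if not cases or not controls:
        raise ValueError("need at least one patient with and one without the event")
    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0      # higher risk assigned to the patient with the event
            elif case_score == control_score:
                concordant += 0.5      # tied risk groups count as half
    return concordant / (len(cases) * len(controls))
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect separation; because a three-level grouping produces many ties, differences between two groupings tend to be modest in absolute terms.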

DISCUSSION
With an increased understanding of AML biology, improved cytogenetic and molecular characterizations, and the availability of novel therapeutic compounds, adjustments to our prognostic guidance systems are inevitable. Very recently, this has been implemented by the updated ELN2022 risk classification system, which now, in addition to a conventional cytogenetic characterization, takes into account the mutation status of 13 genes, seven more than in the ELN classification of 2017 [2]. While numerous studies evaluated various included aberrations separately, to our knowledge no study validated the ELN2022 risk stratification in AML patients. Here we analyzed a cohort of 522 AML patients homogeneously treated with an allogeneic HSCT at our institution. Since this was a retrospective analysis, all patients were diagnosed with AML according to the WHO 2016 classification, and no patients belonged to the newly introduced category MDS/AML (comprising patients with >10% blasts at diagnosis).
Regarding the three ELN2022 risk groups at diagnosis, we observed distinct clinical and biological characteristics associated with ELN2022 adverse risk, including higher age, a higher amount of therapy-related or secondary AML, and different co-mutation profiles. While patients with ELN2022 adverse risk at diagnosis had a lower chance to achieve a CR/CRi prior to HSCT, intriguingly, the likelihood of an MRD-positive or MRD-negative CR/CRi at HSCT did not differ between the three ELN2022 risk groups (Table 1).
With respect to outcomes, the ELN2022 risk classification was able to allocate AML patients into three risk groups with significantly distinct outcomes, which was especially seen in patients younger than 60 years at HSCT and in patients transplanted in first CR/CRi. However, in its most recent form, the ELN2022 risk classification at diagnosis performed worse than the ELN2017 risk classification in all analyzed endpoints (Supplementary Fig. S1).
When we analyzed the three ELN2022 risk groups separately, we observed no significantly different CIR, EFS, or OS between the distinct genetic aberrations characterizing favorable or intermediate ELN2022 risk (Fig. 3 and Supplementary Fig. S5). Notably, patients with an FLT3-ITD AR higher or lower than 0.5 (as included in the ELN2017 risk classification) did not differ regarding their CIR (P = 0.13), EFS (P = 0.30), or OS (P = 0.30) after HSCT, supporting the removal of the FLT3-ITD AR from the ELN2022 risk stratification. While only a minority of our patient population received FLT3 inhibitors, and although the analysis was consequently restricted by patient numbers, outcomes tended to improve in ELN2022 intermediate risk in a subanalysis of AML patients treated in the era of new drugs or within the verum arm of a trial testing an FLT3 inhibitor (Supplementary Fig. S9). In contrast, the outcomes of patients allocated to the adverse ELN2022 risk group differed significantly, with the best outcomes in patients harboring mutations in the newly included genes BCOR, EZH2, SF3B1, SRSF2, STAG2, U2AF1, and ZRSR2, which now define adverse risk in the absence of a favorable risk aberration. Previous studies that showed a potential adverse prognostic impact of myelodysplasia-related gene mutations all included fewer than 50% of patients receiving a consolidating allogeneic HSCT [12,24,25]. Furthermore, two independent studies indicated that patients with myelodysplasia-related gene mutations might have improved outcomes with consolidating allogeneic HSCT, as compared to chemotherapy alone [12,26]. Similarly, in our transplanted patient population, myelodysplasia-related gene mutations were not associated with adverse outcomes when no other adverse risk characteristics were present. One could speculate that an allogeneic HSCT might have the potential to overcome the adverse prognostic impact of myelodysplasia-related gene mutations.
This is in line with our previous data indicating that patients with secondary AML only had adverse outcomes after HSCT when ELN2017 adverse-risk genetics were present [27], and that SRSF2 mutations are not associated with adverse outcomes after allogeneic HSCT [28]. A recently published ASH abstract by Rausch et al. also indicated intermediate outcomes in patients characterized as adverse ELN2022 risk due to a myelodysplasia-related gene mutation in two independent cohorts treated within AMLCG or AML-SG study protocols [29]. Consequently, whether myelodysplasia-related gene mutations really confer adverse outcomes in the context of the ELN2022 classification should be further evaluated. Some important additional diagnostic changes regarding adverse risk, affecting complex karyotypes, NPM1, and TP53, were introduced by the ELN2022, but affected only a few patients in our cohort (please see Supplementary Information).
Importantly, the ELN2022 now allows for adjustment of the assigned risk according to the MRD status during or after therapy: "a patient with favorable risk AML may be reclassified as intermediate risk or vice versa, based on the presence or absence of MRD, respectively" [3]. Following this suggestion, in our set, approximately half of the patients with favorable or intermediate ELN2022 at diagnosis risk changed their risk at HSCT (Fig. 4A). This resulted in improved outcomes for MRD-adjusted favorable risk patients, while for MRD-adjusted intermediate risk patients, outcomes were comparable to those of patients with ELN2022 at diagnosis adverse risk.
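One possible reading of the quoted ELN2022 rule, written as a small function, is sketched below. The function name and the handling of patients without an MRD result (risk left unchanged) are assumptions of this sketch; the adverse group is left untouched here, as in the original ELN2022 suggestion, and the refined scheme proposed in this study is not reproduced because its definitions are given only in the supplement.

```python
def eln2022_mrd_adjusted(risk_at_diagnosis, mrd_positive):
    """Adjust the ELN2022 risk group by MRD status (one reading of the quoted rule).

    risk_at_diagnosis : "favorable", "intermediate", or "adverse"
    mrd_positive      : True, False, or None if no MRD result is available
    """
    if mrd_positive is None:
        return risk_at_diagnosis              # assumption: no MRD data, no change
    if risk_at_diagnosis == "favorable" and mrd_positive:
        return "intermediate"                 # MRD present: favorable moves up
    if risk_at_diagnosis == "intermediate" and not mrd_positive:
        return "favorable"                    # MRD absent: intermediate moves down
    return risk_at_diagnosis                  # adverse risk is not adjusted here
```

Under this reading, the MRD-adjusted intermediate group consists entirely of MRD-positive patients from the favorable and intermediate groups, which fits the observation that its outcomes approach those of the adverse group.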
With these findings, we sought to improve upon the current ELN2022 risk classification in the transplant context by introducing two changes. First, since the myelodysplasia-related gene mutations confer rather intermediate outcomes, we reclassified these patients as ELN2022 refined at diagnosis intermediate risk. This resulted in significantly better prediction of relapse, EFS, and OS. Next, we intended to improve upon the MRD adjustments. We previously demonstrated that the clinical value of the MRD status at HSCT is dependent on the ELN2017 risk at diagnosis. The higher the genetic risk, the more likely an "MRD-negative" patient relapses after HSCT and, thus, the lower the relative risk of relapse of "MRD-positive" patients in the same risk group [30]. While this remains true when the ELN2022 risk is considered, we still observed that the MRD status at HSCT has a strong prognosis-refining impact in patients within the ELN2022 refined at diagnosis adverse risk group (Supplementary Fig. S10). Consequently, an ELN2022 refined MRD-adjusted risk classification, in which we also adjusted the risk within the ELN2022 refined at diagnosis adverse risk group, predicted CIR and EFS better than the originally proposed ELN2022 MRD-adjusted classification (Fig. 5D-F and Supplementary Fig. S7E, F).
Apart from the ELN risk classification system, a variety of prediction models have been developed to improve outcome prediction and inform treatment decisions in AML. These include models like the knowledge bank approach introduced by Gerstung et al. to predict remission and relapse rates, but also NRM [31]. The clinical relevance of this model has been validated [32][33][34][35] and shown to be superior to the ELN2017 risk prediction in patients consolidated with chemotherapy [33], but not with allogeneic HSCT [35]. In addition to this approach, the increased use of machine learning and artificial intelligence approaches in outcome prediction will likely further impact AML risk assessment in the future [36].
In conclusion, our study is the first to explore the prognostic significance of the ELN2022 risk groups in AML. While the ELN2022 allows a risk stratification in AML patients undergoing allogeneic HSCT, it did not outperform the ELN2017 classification in outcome prognostication. When we refined the ELN2022 classification system by redistributing patients with diagnostic myelodysplasia-related gene mutations to the intermediate group and expanding the MRD-based reclassification to the adverse risk group, we improved the discriminative power of the ELN2022 risk classification. Further studies are needed to confirm our results, especially regarding the proposed refinements of the ELN2022 risk stratification at diagnosis and concerning the impact of MRD.

DATA AVAILABILITY
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.
Fig. 5 Outcomes according to the proposed refinement of the ELN2022 risk groups at diagnosis and MRD-adjusted at HSCT. A Transition plot for patient distribution from the ELN2022 at diagnosis risk to the ELN2022 refined at diagnosis risk. B ROC curve comparison for suffering an event (relapse or death) within 1 year after HSCT between the ELN2022 at diagnosis and the ELN2022 refined at diagnosis risk groups, C Event-free survival according to the three ELN2022 refined at diagnosis risk groups, D Transition plot for patient distribution from the ELN2022 MRD-adjusted risk to the ELN2022 refined MRD-adjusted risk at HSCT, E ROC curve comparison for suffering an event (relapse or death) within 1 year after HSCT between the ELN2022 MRD-adjusted and the ELN2022 refined MRD-adjusted risk groups at HSCT, and F Event-free survival according to the three ELN2022 refined MRD-adjusted risk groups at HSCT.